Automatically Extracting Typical Syntactic Differences from Corpora

Related resources

Automatically Extracting Typical Syntactic Differences from Corpora

We develop an aggregate measure of syntactic difference for automatically finding common syntactic differences between collections of text. With this measure it is possible to mine for differences between, for example, the English of learners and natives, or between related dialects. If formulated in advance, hypotheses can also be tested for statistical significance. It enables us to...

Extracting semantic relations from Portuguese corpora using lexical-syntactic patterns

The growing investment in automatic extraction procedures, together with the need for extensive resources, makes semi-automatic construction a new, viable and efficient strategy for developing language resources, combining accuracy, size, coverage and applicability. These assumptions motivated the work described in this paper, aiming at the establishment and use of lexical-syntactic patterns f...

Automatically enriching spoken corpora with syntactic information for linguistic studies

Syntactic parsing of speech transcriptions faces the problem of the presence of disfluencies that break the syntactic structure of the utterances. We propose in this paper two solutions to this problem. The first one relies on a disfluencies predictor that detects disfluencies and removes them prior to parsing. The second one integrates the disfluencies in the syntactic structure of the utteran...

Detecting Syntactic Substratum Effects Automatically in Interlanguage Corpora

This paper applies techniques to obtain an aggregate measure of syntactic distance between two varieties of English spoken by first- and second-generation Finnish Australians and examines the degree of what we call syntactic ‘contamination’ in the two. Our general goal is to detect the linguistic sources of the variation between the two groups and interpret the findings from (at least) two perspe...

Automatically Inducing Ontologies From Corpora

The emergence of vast quantities of on-line information has raised the importance of methods for automatic cataloguing of information in a variety of domains, including electronic commerce and bioinformatics. Ontologies can play a critical role in such cataloguing. In this paper, we describe a system that automatically induces an ontology from any large on-line text collection in a specific dom...

Journal

Journal title: Literary and Linguistic Computing

Year: 2010

ISSN: 0268-1145,1477-4615

DOI: 10.1093/llc/fqq017